Contributor

@yuandrew yuandrew commented Oct 14, 2025

What was changed

Send plugin names over to core for worker heartbeating.

Updated Core to latest main, 9e9a461.

Updated the test to validate that replacing a worker's client with a client from a different runtime is invalid.

Why?

Worker heartbeating

Checklist

  1. Closes [Feature Request] Enable Worker Heartbeating #1196

  2. How was this tested:

Added tests for plugin name propagation and runtime options configuration.

  3. Any docs updates needed?

Note

Propagates worker plugin names to Core (via heartbeats), adds runtime worker-heartbeat options, migrates bridge to new temporalio_* crates, updates generated APIs/protos (cloud/workflowservice), and adjusts Python worker/runtime and tests accordingly.

  • Bridge/Core Migration:
    • Move from temporal_client/temporal-sdk-core* to temporalio_client/temporalio_common/temporalio_sdk_core crates; update deps (prost/tonic/opentelemetry).
    • init_runtime now takes RuntimeOptions (includes worker_heartbeat_interval_millis).
    • Add worker task-type selection and plugin list passthrough; replace_client returns error on failure.
    • Extend RPC routing to new methods (describe_worker, set_worker_deployment_manager, etc.).
  • Python Runtime/Worker:
    • Runtime accepts worker_heartbeat_interval; updates logging filter targets.
    • Worker config adds WorkerTaskTypes and plugins; forward plugin names to Core.
    • Client exposes plugins list; replayer sets task types.
  • APIs/Protos:
    • Cloud: add AuditLogSinkSpec, SetServiceAccountNamespaceAccess, ValidateAccountAuditLogSink; sink types KinesisSpec, PubSubSpec.
    • WorkflowService: add DescribeWorker, SetWorkerDeploymentManager; extend start/get-cluster/worker-deployment requests; minor doc tweaks.
    • Namespace/Deployment messages gain new fields (capabilities, manager_identity); activation adds last_sdk_version.
  • Tests/CI:
    • New tests for plugin propagation and runtime options; disallow client replacement across runtimes; adjust worker tests.
    • CI: minor lint step order change; scripts adjust proto paths and module rename.

Written by Cursor Bugbot for commit d342c8d. This will update automatically on new commits.
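The note above says the Python runtime accepts worker_heartbeat_interval while the bridge's RuntimeOptions carries worker_heartbeat_interval_millis. A minimal sketch of that conversion might look like the following. The helper name is hypothetical, and passing None through unchanged is an assumption; the PR text does not say how an unset interval is represented.

```python
from datetime import timedelta
from typing import Optional


def heartbeat_interval_millis(interval: Optional[timedelta]) -> Optional[int]:
    """Convert an optional interval to whole milliseconds.

    Hypothetical helper, not the SDK's code. Assumption: None is passed
    through unchanged for the bridge layer to interpret.
    """
    if interval is None:
        return None
    # Floor-dividing two timedeltas yields an int count of milliseconds.
    return interval // timedelta(milliseconds=1)
```

For example, an interval of timedelta(seconds=30) would be handed to the bridge as 30000.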

@yuandrew yuandrew marked this pull request as ready for review October 21, 2025 01:04
@yuandrew yuandrew requested a review from a team as a code owner October 21, 2025 01:04
)
.nexus_task_poller_behavior(conf.nexus_task_poller_behavior)
.plugins(
conf.plugins
Contributor

So, something that will need some discussion here. This PR reports worker plugins. We should discuss whether that is really what we intend. Plugins which exist only in the client and not the worker will be completely invisible. That could potentially be changed here at least for plugins in clients used by workers, though not generally for any client. I think that was something we didn't really think through when we decided to go with heartbeat as a carrier for this information, but maybe we conclude that is fine.

TypeScript will be a bit more complicated as well with its additional plugin types.

Contributor Author

Discussed offline: for now we will report both worker and client plugins and dedupe the names.
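As a rough illustration of the dedup step agreed on here, the sketch below merges client and worker plugin names while keeping first-seen order. The function name and the choice to list client plugins first are assumptions for illustration, not what the PR necessarily implements.

```python
from typing import Iterable, List


def merge_plugin_names(
    client_plugins: Iterable[str], worker_plugins: Iterable[str]
) -> List[str]:
    """Combine client and worker plugin names, dropping duplicates.

    Hypothetical helper: the first occurrence of a name wins and the
    original order is preserved.
    """
    # dict.fromkeys keeps insertion order and discards repeated keys.
    return list(dict.fromkeys([*client_plugins, *worker_plugins]))
```

A plugin registered on both the client and the worker would then be reported to Core only once.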

Member

@Sushisource Sushisource left a comment

This is looking good to me; the only open point is the default interval.

Member

@Sushisource Sushisource left a comment

Nice!

@cursor cursor bot left a comment

Bug: Runtime Comparison Needs Proper Instance Resolution

The runtime comparison uses identity (is not) but doesn't properly handle when bridge_client.config.runtime is None. When the original client uses the default runtime (via None), self._runtime stores the actual default runtime object, but if a new client also has runtime=None in its config, the comparison self._runtime is not None incorrectly raises an error even though both clients use the same default runtime. The comparison should resolve both sides to actual runtime instances before comparing.

temporalio/worker/_worker.py#L648-L652

bridge_client = _extract_bridge_client_for_worker(value)
if self._runtime is not bridge_client.config.runtime:
    raise ValueError(
        "New client is not on the same runtime as the existing client"
    )
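One way to read the suggestion above ("resolve both sides to actual runtime instances before comparing") is sketched below. Runtime, ClientConfig, resolve_runtime, and check_same_runtime are stand-ins defined here so the example is self-contained; they are not the SDK's actual classes or the fix that was merged.

```python
from dataclasses import dataclass
from typing import Optional


class Runtime:
    """Stand-in for a runtime with a lazily created process-wide default."""

    _default: Optional["Runtime"] = None

    @classmethod
    def default(cls) -> "Runtime":
        if cls._default is None:
            cls._default = cls()
        return cls._default


@dataclass
class ClientConfig:
    # Assumption: None means "use the default runtime".
    runtime: Optional[Runtime] = None


def resolve_runtime(config: ClientConfig) -> Runtime:
    """Map a possibly-None configured runtime to a concrete instance."""
    return config.runtime if config.runtime is not None else Runtime.default()


def check_same_runtime(current: Runtime, new_config: ClientConfig) -> None:
    """Raise if the new client's resolved runtime differs from the current one."""
    if current is not resolve_runtime(new_config):
        raise ValueError(
            "New client is not on the same runtime as the existing client"
        )
```

With this shape, a replacement client configured with runtime=None compares equal to a worker already running on the default runtime, while a client on a separately constructed runtime is still rejected.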



py: Python<'p>,
call: RpcCall,
) -> PyResult<Bound<'p, PyAny>> {
use temporal_client::WorkflowService;
Contributor

This is a code-generated file; you need to change the generator instead.

"temporalio_sdk",
]
parts = [self.other_level]
parts.extend(f"{target}={self.core_level}" for target in targets)

Bug: Outdated Telemetry Target Breaks Rust Log Filtering

The TelemetryFilter.formatted() method includes "temporalio_sdk" as a target, but this doesn't match any actual Rust crate name. The crates were renamed from temporal_sdk_core, temporal_client, etc. to temporalio_sdk_core, temporalio_client, and temporalio_common. The target "temporalio_sdk" will never match any log output from the Rust code, preventing those logs from being filtered at the configured level. This should likely be "temporalio_sdk_core" (which is already in the list) or removed entirely.
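A minimal sketch of what the filter string could look like with only the renamed crate targets is shown below. The dataclass is a simplified stand-in, the target order is arbitrary, and joining the parts with commas is an assumption; the diff fragment above shows only how parts is built.

```python
from dataclasses import dataclass

# Crate names taken from the comment above; "temporalio_sdk" is left out
# because the comment says it matches no actual crate.
CORE_TARGETS = ("temporalio_common", "temporalio_client", "temporalio_sdk_core")


@dataclass(frozen=True)
class TelemetryFilter:
    """Simplified stand-in for the SDK's filter configuration."""

    core_level: str
    other_level: str

    def formatted(self) -> str:
        # The default level comes first, then one directive per core target.
        parts = [self.other_level]
        parts.extend(f"{target}={self.core_level}" for target in CORE_TARGETS)
        # Assumption: directives are comma-separated.
        return ",".join(parts)
```

Under these assumptions, TelemetryFilter(core_level="INFO", other_level="ERROR").formatted() yields "ERROR,temporalio_common=INFO,temporalio_client=INFO,temporalio_sdk_core=INFO".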


- run: uv sync --all-extras
- run: poe lint
- run: poe build-develop
- run: poe lint

Bug: Misleading artifact name for test job.

The artifact name includes --time-skipping but the test-latest-deps job doesn't actually run any time-skipping tests. The job only executes poe test -s --junit-xml=junit-xml/latest-deps.xml, so the artifact name is misleading and doesn't match the actual test content. This could cause confusion when reviewing test results.




Development

Successfully merging this pull request may close these issues.

[Feature Request] Enable Worker Heartbeating

4 participants